DIMENSION-DEPENDENT BEHAVIOR IN THE SATISFIABILITY OF RANDOM 

k-HORN FORMULAE 

GABRIEL ISTRATE* 

Abstract. We determine the asymptotic satisfiability probability of a random at-most-$k$-Horn formula, via a probabilistic analysis of a simple version, called PUR, of positive unit resolution. We show that for $k = k(n) \to \infty$ the problem can be "reduced" to the case $k(n) = n$, which was solved in [10]. On the other hand, in the case $k$ = constant the behavior of PUR is modeled by a simple queuing chain, leading to a closed-form solution when $k = 2$. Our analysis predicts an "easy-hard-easy" pattern in this latter case. Under a rescaled parameter, the graphs of the satisfaction probability corresponding to finite values of $k$ converge to the one for the uniform case, a "dimension-dependent behavior" similar to the one found experimentally in [20] for $k$-SAT. The phenomenon is qualitatively explained by a threshold property for the number of iterations PUR makes on random satisfiable Horn formulas. 
Also, for $k = 2$ PUR has a peak in its average complexity at the critical point. 

1. Introduction. Finding the ground state (state of minimum energy) of a physical system and computing an optimal solution to a combinatorial optimization problem are intuitively two very similar tasks. This simple observation, which motivated the development of simulated annealing [19], a simple general-purpose heuristic for combinatorial optimization, lies behind the recent birth of a new field at the crossroads of Statistical Mechanics, Theoretical Computer Science and Artificial Intelligence, which studies phase transitions in combinatorial problems (see [14] for a readable introduction). The transfer of principles and methods from Physics (mainly from Spin Glass Theory [?]) to Computer Science has already been quite successful, and is responsible for a couple of interesting results, such as a better understanding of the factors that account for computational intractability [27, ?], strikingly accurate predictions of the average running time of various algorithms [11, 21], or of expected values of optimal solutions [?]. 

The need for a rigorous validation of these insights is quite obvious. The theory of spin glasses is a relatively young field, which still presents many heuristic, unsolved or plainly controversial aspects (for example, see [29, 31, 30] for a debate on the validity and scope of the so-called Parisi solution of the Sherrington-Kirkpatrick model). Moreover, while physical intuition can guide the development of the theory for "physical" models, by corroborating (or falsifying) some of its predictions (e.g. see [?] for a discussion of the demise, on physical grounds, of the first formulation of the so-called replica method), such intuition is not available when applying this type of ideas to combinatorial problems. Given that rigorous results are hard to come by in the case of spin glasses proper, it is not surprising that, while there has recently been some progress (see e.g. [?]), an analysis of most interesting combinatorial problems is still out of reach. 

An approach that has been popular in Statistical Mechanics is to gather intuition through the systematic study of exactly solved models [?]. These are "toy" versions of the original models that are simple to deal with, but retain many of the properties of the original ones. We advocate such an approach for problems in Computer Science as well, and the purpose of this paper is to present a (hopefully nontrivial) "exactly solvable satisfiability model" that displays a dimension-dependent behavior fairly similar to the one observed previously in various contexts such as percolation [?], self-avoiding walks, and recently for $k$-satisfiability by Kirkpatrick and Selman [20]. The problem we investigate is random Horn satisfiability, and the "dimensionality" of a formula is taken to be the maximum length of its clauses.¹ 

2. Overview. There are actually two different notions of phase transition in a combinatorial problem. The first of them, called order-disorder phase transition, applies to optimization problems and directly parallels the approach from Statistical Mechanics. Potential solutions for an instance of a problem $P$ are viewed as "states" of a system. One defines an abstract Hamiltonian (energy) function that measures the "quality" of a given solution, and applies methods from the theory of spin glasses [?] to make predictions on the typical structure of optimal solutions. In this setting a phase transition is defined as non-analytical behavior of a certain "order parameter" called free energy, and a discontinuity in this parameter, manifest in the sudden emergence of a backbone of constrained "degrees of freedom" [?], is responsible for the exponential slow-down of many natural algorithms. 

The second definition is combinatorial and pertains to decision problems. It relies on the concept of threshold property from random graph theory, more precisely on a restricted version of this notion, called sharp threshold. A satisfiability threshold always exists for monotone problems [?], but may or may not be sharp (we speak of a coarse threshold in the latter case). 

The layout of the paper is as follows: we first review the results of Kirkpatrick and Selman, in particular discussing the concept of critical behavior, as well as some objectionable aspects of their results. We then define the type of dimension-dependent behavior we are interested in, argue that it captures to a large extent the results presented in [20], and contrast it with critical behavior. Our results are presented and discussed in Section 6, while in a later section we further discuss their significance. 

Finally, for $k = 2$, the case where the satisfaction probability has a singularity, we are able to rigorously display another phenomenon that is believed to be characteristic of phase transitions: in many cases the "hardest on the average" instances appear at the transition point (even if we only consider satisfiable instances [?]); this feature is quite robust with respect to the choice of the particular algorithm [8]. We are able to prove that for a particular problem, random at-most-2-Horn satisfiability, the average running time of a particular algorithm, when restricted to satisfiable instances (the ones that are statistically significant on both sides of the critical point), is finite outside the critical point and diverges as we approach this point, thus providing some evidence for the experimental wisdom. 

3. Phase transitions and critical behavior. We first discuss threshold phenomena, briefly and only to the extent needed here. Perhaps the best way to introduce them is through a concrete example. To do this, we will use one "canonical" NP-complete problem, $k$-CNF satisfiability. 

To generate random formulas we use a model with one parameter, the constraint density $c$, defined as the ratio between the number of clauses $m$ and the number of variables $n$ of the formula. A random formula is obtained by choosing $m$ random clauses. If we plot the probability that such a random formula is satisfiable against the constraint density $c$, we notice the existence of a critical value $c_k$ such that the satisfaction probability drops (as $n \to \infty$) from one to zero at $c_k$. Such a "sudden change" is an illustration of the mathematical concept of sharp threshold, qualitatively illustrated in Figure 3.1. The existence of a critical value $c_k$ has not been rigorously established (except for $c_2 = 1$), even though Friedgut [?] has shown that the transition is "sharp" for every $k$. 

Of special interest will also be the width of the so-called scaling window (a.k.a. critical region). To define it consider, for $0 < \delta < 1$, $\alpha_-(n, \delta)$, the supremum over $\alpha$ such that for $m = \alpha n$, the probability of a random formula being satisfiable is at least $1 - \delta$. Similarly, let 



¹ For technical convenience, throughout the paper random $k$-Horn satisfiability is understood as random at-most-$k$-Horn satisfiability. 

Fig. 3.1. Qualitative picture of a (rescaled) sharp threshold 



$\alpha_+(n, \delta)$ be the infimum over $\alpha$ such that for $m = \alpha n$, the probability of a random formula being satisfiable is at most $\delta$. Then, for $\alpha$ within the $\delta$-scaling window 

(3.1)    $W(n, \delta) = (\alpha_-(n, \delta), \alpha_+(n, \delta))$, 

the probability that a random formula is satisfiable is between $\delta$ and $1 - \delta$. 

We will be interested in the width of the window $W(n, \delta)$ as a function of $n$. It is generally believed that $|W(n, \delta)| = \Theta(n^{-1/\nu})$ for some $\nu = \nu_k > 0$ independent of $\delta$, even though the existence of $\nu_k$ has only been established for $k = 2$ [?]. 
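
The definition of the scaling window can be made concrete with a small numerical sketch. The code below is only an illustration under stated assumptions: `scaling_window` and `toy_curve` are hypothetical names, the satisfaction probability is replaced by an artificial logistic curve whose transition sharpens like $n^{1/\nu}$, and the supremum and infimum are approximated on a finite grid of densities.

```python
import math

def scaling_window(prob_sat, delta, alphas):
    """Grid estimate of (alpha_minus, alpha_plus) for a non-increasing curve.

    prob_sat maps a density alpha to a satisfaction probability;
    alphas is the finite grid on which sup and inf are approximated.
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    # alpha_minus: largest grid density with Pr[sat] >= 1 - delta
    lower = [a for a in alphas if prob_sat(a) >= 1.0 - delta]
    # alpha_plus: smallest grid density with Pr[sat] <= delta
    upper = [a for a in alphas if prob_sat(a) <= delta]
    return (max(lower) if lower else None, min(upper) if upper else None)

def toy_curve(n, critical=1.0, nu=2.0):
    """Hypothetical satisfaction curve; NOT derived from any real model."""
    scale = n ** (1.0 / nu)
    return lambda a: 1.0 / (1.0 + math.exp(scale * (a - critical)))

def window_width(n, delta, alphas, nu=2.0):
    lo, hi = scaling_window(toy_curve(n, nu=nu), delta, alphas)
    return hi - lo
```

For this toy curve the width scales like $n^{-1/\nu}$ by construction, so multiplying $n$ by 100 with $\nu = 2$ should shrink the window by a factor of about 10.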

3.1. Order/disorder phase transitions. Statistical mechanics deals with the description of systems having a large number of degrees of freedom. One of its fundamental predictions is that at thermal equilibrium each state $\sigma$ of such a system occurs with probability proportional to $\exp(-\beta H(\sigma))$, where $\beta$ is an inverse temperature and $H$ is a Hamiltonian function, describing the energy of the particular state $\sigma$. The resulting distribution is called the Gibbs distribution $G_\beta$, given by 

$$\Pr[\sigma] = \frac{\exp(-\beta \cdot H(\Phi; \sigma))}{Z(\Phi)},$$ 

where 

$$Z(\Phi) = \sum_{\sigma \in \{0,1\}^n} \exp(-\beta \cdot H(\Phi; \sigma))$$ 

is the so-called partition function. 

Changes in the order properties of the system, which characterize order-disorder phase transitions, manifest themselves as non-analytical behavior of thermal averages (i.e. averages over the Gibbs distribution) of a certain order parameter. We want to emphasize that the physicists' use of the term order parameter is quite different from the one in combinatorics. An order parameter is a quantity that is zero on one side of the phase transition and becomes non-zero on the other side (for instance the satisfaction probability could be an order parameter). 

One of the simplest illustrations of these concepts is the two-dimensional Ising model (see [?] for a thorough treatment). In this model we have a number of spins, which are small magnets located on the vertices of the two-dimensional lattice, and pointing either up or down. The spins interact with their neighbors and with an external magnetic field $h \in \mathbb{R}$, which will tend to align the spins in one of the two directions. The energy of a state $\sigma$ is 

$$H(\sigma) = -\sum_{\langle i,j \rangle} \sigma_i \cdot \sigma_j + h \cdot \sum_i \sigma_i,$$ 

where $\langle i,j \rangle$ ranges over pairs of neighboring sites. 



The order parameter is called free energy, is a function of temperature, and is formally 
defined as 

It measures the fraction of spins that are "frozen" when the field is turned off. 

We now briefly describe the essence of the phase transition: above a certain temperature $T_c$, the Curie-Weiss point, when the magnetic field is turned to zero the proportion of spins that point in each direction is about 1/2 (the so-called disordered phase). But for temperatures below $T_c$, when we turn the field to zero some orientation still dominates (the ordered phase), and the proportion of spins pointing up (down) changes discontinuously as $h$ passes through zero. 

The connection with combinatorial optimization follows from the observation that when $\beta \to \infty$ (that is, the temperature approaches 0 K), the Gibbs distribution $G_\beta$ converges to a uniform distribution $G$ on the set of states of minimal energy (ground states). Thus, based on this analogy, one can hope that ideas from Statistical Mechanics are able to provide insight into the structure of optimal solutions to an instance of a problem in Combinatorial Optimization. Rather than providing a complete discussion (which would require rigorously defining the notion of optimization problem) we will discuss this in the context of MAX 3-SAT, the optimization version of satisfiability. For now it suffices to mention the three main ingredients of an optimization problem: its instances, solutions to instances of a problem, and a cost function that measures the quality of a solution for a certain instance. 

Example 1. (MAX 3-SAT) 

Input: A propositional formula $\Phi$ in conjunctive normal form, such that every clause has length exactly 3. 

Solution: A truth assignment $\sigma$ for the propositional variables in $\Phi$ that maximizes the number of satisfied clauses. 

Cost function: The cost $C(\Phi, \sigma)$ of a truth assignment $\sigma$ for an instance $\Phi$ of MAX 3-SAT is the number of clauses of $\Phi$ that are violated by $\sigma$. 

Let $Q$ be an optimization problem and let $\Phi$ be an instance of $Q$ "on $n$ variables" (i.e., all solutions have length $n$). We view the set of all assignments in $\{0, 1\}^n$ as "states of a system." To each such state $\sigma$ we associate the Hamiltonian (energy function) 

$H(\Phi; \sigma)$ = the cost of instance $(\Phi; \sigma)$ of $Q$. 
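
The convergence of the Gibbs distribution to the uniform distribution on ground states can be checked by brute force on a very small instance. The following is a minimal sketch, assuming the MAX 3-SAT cost function above as Hamiltonian; the names `violated` and `gibbs`, the integer encoding of literals, and the convention that 1 means "true" are choices made for this illustration only.

```python
import itertools
import math

def violated(clauses, assignment):
    """Cost C(Phi, sigma): number of clauses whose literals are all false.

    A literal is a nonzero integer: +i stands for x_i and -i for its
    negation (variables are numbered from 1); assignment[i-1] is 0 or 1.
    """
    count = 0
    for clause in clauses:
        satisfied = any((lit > 0) == bool(assignment[abs(lit) - 1])
                        for lit in clause)
        if not satisfied:
            count += 1
    return count

def gibbs(clauses, n, beta):
    """Gibbs distribution over {0,1}^n with H(Phi; sigma) = C(Phi, sigma)."""
    states = list(itertools.product((0, 1), repeat=n))
    weights = [math.exp(-beta * violated(clauses, s)) for s in states]
    z = sum(weights)  # partition function Z(Phi)
    return {s: w / z for s, w in zip(states, weights)}
```

At $\beta = 0$ every state has the same weight, while for large $\beta$ essentially all the mass sits, evenly spread, on the assignments of minimal cost.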



Example 2. Let $\Phi$ be a 3-CNF formula, and let $\sigma$ be an assignment. According to the previous definition $H(\Phi; \sigma) = C(\Phi; \sigma)$. $H$ can be formally expressed [?] as 

$$H(\Phi; \sigma) = \sum_{l=1}^{m} \delta\Big[\sum_{i=1}^{n} C_{l,i} \cdot (-1)^{\sigma_i};\, -3\Big],$$ 

where $\delta[i; j] = 1_{\{i = j\}}$ is the Kronecker symbol and $C_{l,i}$ is 1 if the $l$th clause contains the literal $x_i$, $-1$ if it contains $\overline{x_i}$, and zero otherwise. 

For the problems of interest to Computer Science the instance $\Phi$ is not fixed, but rather is a sample from a certain distribution. This is very similar to the context of spin-glass theory, a subfield of Statistical Mechanics. The extra ingredient of this theory is that the coupling coefficients are no longer considered fixed, but are rather independent samples from a certain distribution (in the language of the theory of spin glasses, $\Phi$ is called a quenched quantity). 

As in the case of the Ising model, the order parameter is the ground state free energy, more precisely its expected value 



^— ^l-(^)' 



where $\langle \ldots \rangle$ stands for the average over the random distribution of $\Phi$. 

Definition 3.1. A physical (order/disorder) phase transition in a combinatorial optimization problem is a point where $f$ is not analytical. 

Free energy has an especially crisp intuitive interpretation in the case of the problem MAX 3-SAT [?]: 

Example 3. Let $\Phi_n$ be an instance of MAX 3-SAT, and let $A$ be the set of optimal assignments to $\Phi_n$, endowed with the uniform measure $\mu_n$. Statistical Mechanics predicts that, as $n \to \infty$, $\mu_n$ is "close" to a product measure on $\{0, 1\}^n$, $\mu_{1,n} \otimes \ldots \otimes \mu_{n,n}$. The free energy per site $f$ is the fraction of variables $x_i$ that are (asymptotically) fully constrained (that is, $\mu_{i,n}$ converges in distribution to a measure having all its weight on one of the two points 0, 1). 

4. Critical behavior and the mean-field approximation. An important feature that order/disorder phase transitions share with the combinatorial notion of threshold properties (which are usually the type of phase transition of interest in combinatorics) is that the various quantities of interest, such as the satisfaction probability, the ground state energy, and the location of the phase transition, are hard to compute. No general-purpose methods exist, and in some cases even obtaining good non-rigorous estimates is a challenging open problem. 

A technique that often provides realistic approximate values for these quantities has come to be known as the mean-field (annealed) approximation. In a nutshell, suppose that we are trying to compute the average (over a certain discrete probability space) of a certain expression $f \circ (g_1, \ldots, g_n)$. Then the mean-field approximation amounts to taking 

$$E[f(g_1(x), \ldots, g_n(x))] \approx f(E[g_1(x)], \ldots, E[g_n(x)]).$$ 

This technical definition of the mean-field approximation does not by itself convey the underlying intuition, which is the following: suppose we want to solve a combinatorial problem whose objective function depends on simultaneously satisfying several "constraints" whose effects are usually not independent. The mean-field approximation ignores the dependencies between the various constraints, and treats them as independent. 
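
A toy computation shows how much the approximation can lose when the constraints are strongly dependent. In the sketch below (an illustration only; `f` and `exact_and_mean_field` are hypothetical names) all the $g_i$ are equal to one and the same Bernoulli($p$) variable, the extreme case of dependence, and $f(x_1, \ldots, x_r) = 1 - \prod_i (1 - x_i)$.

```python
def f(xs):
    """f(x_1, ..., x_r) = 1 - prod_i (1 - x_i)."""
    prod = 1.0
    for x in xs:
        prod *= 1.0 - x
    return 1.0 - prod

def exact_and_mean_field(p, copies):
    """Compare E[f(g_1, ..., g_r)] with f(E[g_1], ..., E[g_r]).

    Assumption of this sketch: g_1 = ... = g_r = X, where X is a single
    Bernoulli(p) variable, so the g_i are fully dependent.
    """
    exact = p * f([1] * copies) + (1.0 - p) * f([0] * copies)
    mean_field = f([p] * copies)
    return exact, mean_field
```

With $p = 0.1$ and ten copies the exact average is $0.1$, whereas the mean-field value is $1 - 0.9^{10}$, more than six times larger; for independent $g_i$ the two quantities would coincide.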

Example 4. Let us return to the case of spin glasses. Each configuration of spins $\sigma$ has an energy specified by a Hamiltonian $H(\sigma)$. A typical expression for $H(\sigma)$ is 

$$H(\sigma) = \sum a_{i,j}\, \sigma_i \sigma_j,$$ 

where the $a_{i,j}$'s are interaction coefficients between adjacent spins (according to some adjacency graph specific to the considered model). The quantity of interest, the average free energy $f$, is hard to compute directly because of the logarithmic function present in the definition of the free energy. In this context the mean-field approximation amounts to 

$$f \approx -\frac{1}{\beta n} \ln[\langle Z[\Phi] \rangle].$$ 

The advantage of this heuristic is that the average on the right-hand side is one that is usually much easier to compute. 

For combinatorial phase transitions, the mean-field approach usually amounts to an approximation using the so-called first-moment method. 

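Example 5. ($k$-Satisfiability) 

The reason that the satisfiability probability of a random formula is hard to compute is that, for two assignments $A$, $B$ the events $A \models \Phi$ and $B \models \Phi$ are not independent. One way to construct a mean-field theory for $k$-SAT is to ignore the dependencies between these events. More precisely, we have 

$$1_{SAT}[\Phi] = f(\delta_{A_1}[\Phi], \ldots, \delta_{A_{2^n}}[\Phi]),$$ 

where 

$$f(x_1, x_2, \ldots, x_{2^n}) = 1 - \prod_{i=1}^{2^n} (1 - x_i)$$ 

and 

$$\delta_A[\Phi] = \begin{cases} 1, & \text{if } A \models \Phi, \\ 0, & \text{otherwise.} \end{cases}$$ 

Define $\gamma_k = 1 - 2^{-k}$. The mean-field approximation amounts to 

$$\Pr[\Phi \in SAT] = E[1_{SAT}[\Phi]] \approx f(E[\delta_{A_1}[\Phi]], \ldots, E[\delta_{A_{2^n}}[\Phi]]).$$ 

Since 

$$E[\delta_A[\Phi]] = \gamma_k^{cn},$$ 

this reads 

$$\Pr[\Phi \in SAT] \approx 1 - [1 - \gamma_k^{cn}]^{2^n} \approx 1 - e^{-2^n \gamma_k^{cn}} = 1 - e^{-E[\#SAT[\Phi]]},$$ 

where $\#SAT[\Phi]$ is the number of satisfying assignments for $\Phi$. Thus (neglecting the case $E[\#SAT[\Phi]] = 1$) 

$$\Pr[\Phi \in SAT] \to \begin{cases} 1, & \text{if } E[\#SAT[\Phi]] \to \infty, \\ 0, & \text{if } E[\#SAT[\Phi]] \to 0. \end{cases}$$ 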
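
The first-moment computation of Example 5 is easy to evaluate numerically. The sketch below assumes the standard model in which each of the $m = cn$ clauses is satisfied by a fixed assignment with probability $\gamma_k = 1 - 2^{-k}$, so that $E[\#SAT[\Phi]] = 2^n \gamma_k^{cn}$; the function names are hypothetical, and the "annealed threshold" is simply the density at which this expectation equals 1.

```python
import math

def gamma(k):
    """Probability that a fixed assignment satisfies a random k-clause."""
    return 1.0 - 2.0 ** (-k)

def log2_expected_sat(n, c, k):
    """log_2 of E[#SAT] = 2^n * gamma_k^(c n)."""
    return n * (1.0 + c * math.log2(gamma(k)))

def annealed_threshold(k):
    """Density c solving 1 + c * log_2(gamma_k) = 0."""
    return -1.0 / math.log2(gamma(k))

def mean_field_sat_probability(n, c, k):
    """Mean-field estimate 1 - exp(-E[#SAT]) of Pr[Phi in SAT]."""
    log2_e = log2_expected_sat(n, c, k)
    if log2_e > 60.0:          # E[#SAT] is astronomically large
        return 1.0
    return -math.expm1(-(2.0 ** log2_e))
```

For $k = 3$ the annealed threshold is about 5.19, and for $n = 100$ the estimate already switches from almost 1 to almost 0 across it.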
4.1. Critical exponents and behavior. A phenomenon that has been observed in various contexts is critical behavior. In these cases the class of problems under study has an intrinsic notion of dimensionality $d$, and in the limit $d \to \infty$ (or sometimes even when $d$ is greater than a so-called critical dimension) "the annealed approximation becomes exact". 

A way to give precise meaning to the above quote comes from the concept of universality. In Statistical Mechanics one defines certain critical exponents that describe the behavior of the system near the critical points; universality predicts that phase transitions with the same critical exponents are "structurally similar". 

Since critical exponents can be defined for the mean-field versions of the physical models too, critical behavior means that as $d \to \infty$ (or, sometimes, for $d$ larger than a value called the upper critical dimension) the critical exponents of the $d$-dimensional system coincide with the critical exponents of the $d$-dimensional mean-field model. 

Example 6. (Bond) percolation on the lattice $\mathbb{Z}^d$. Percolation [?] is a mathematical theory that models the flow of liquids in random porous media. In our case the flow is on the lattice $\mathbb{Z}^d$ of dimension $d$, and the model has one parameter, the edge probability $p \in [0, 1]$. Each bond (grid edge of the lattice $\mathbb{Z}^d$) is considered open with probability $p$ (independently of the other bonds), and the order parameter is the probability $P_d(p)$ that the origin lies in an infinite cluster. $P_d$ is a monotonically increasing function of $p$. It is believed that $P_d(p)$ is zero up to a critical value $p_c(d)$ (known rigorously only for $d = 2$), greater than zero beyond that point, and non-analytical but continuous (at least for $d = 2$) at $p_c(d)$. It is also believed that above (and around) the critical value $P_d(p) \approx (p - p_c(d))^{\beta}$, where $\beta$ is a critical exponent that depends on $d$ but not on the explicit lattice considered (i.e. it would be the same if we chose another $d$-dimensional lattice instead of $\mathbb{Z}^d$). This is only one of the several critical exponents that are believed to structurally characterize percolation on $d$-dimensional lattices (see [?]). 

Without going into further details, we note that the "mean-field approximation" corresponds to considering percolation on the $d$-dimensional Bethe lattice, and the critical behavior amounts to the observation that for $d$ greater than a critical dimension (known to be at most 16 [?], and believed to be 6) the critical exponents of percolation on $\mathbb{Z}^d$ are those of percolation on the Bethe lattice. 

4.2. Rescaling and critical behavior. An example of critical behavior has recently been observed experimentally by Kirkpatrick and Selman [20] for satisfiability problems. 

Their results do not mention critical exponents (although they are closely related to them). To explain them, we first need to introduce another concept from Statistical Mechanics: finite-size scaling. The intuition behind it is that [20] "sufficiently close to a threshold or critical point, systems of all sizes are indistinguishable except for an overall change of scale." In mathematical terms this amounts to defining a new order parameter that "opens up" the scaling window, the region where the probability decreases from 1 to 0. 

Example 7. Hamiltonian Cycle. 

The random model has one parameter $m$, the number of edges. A random sample is obtained by choosing uniformly at random a set of $m$ distinct edges of a complete graph with $n$ vertices. The following result (obtained by Komlós and Szemerédi [22]) describes the phase transition in this problem: 

Let $m = m(n) = \frac{1}{2} n \cdot \log(n) + \frac{1}{2} n \cdot \log\log(n) + c_n \cdot n$. Then 

$$\lim_{n \to \infty} \Pr[G \text{ has a Hamiltonian cycle}] = \begin{cases} 0, & \text{if } c_n \to -\infty, \\ e^{-e^{-2c}}, & \text{if } c_n \to c, \\ 1, & \text{if } c_n \to \infty. \end{cases}$$ 

A rescaled parameter for the Hamiltonian cycle problem can be defined by $c_n = \frac{1}{n} \cdot [m - \frac{n}{2} \cdot \log(n) - \frac{n}{2} \cdot \log\log(n)]$. This parameter yields a rescaled limit probability function 

$$f(c) = e^{-e^{-2c}}.$$ 
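
The rescaling of Example 7 amounts to two one-line functions. The sketch below assumes natural logarithms and the reading $f(c) = e^{-e^{-2c}}$ of the limit function; `rescaled_parameter` and `limit_probability` are names chosen for this illustration.

```python
import math

def rescaled_parameter(n, m):
    """c_n = (m - (n/2) log n - (n/2) log log n) / n, natural logarithms."""
    return (m - 0.5 * n * math.log(n)
            - 0.5 * n * math.log(math.log(n))) / n

def limit_probability(c):
    """Rescaled limit function f(c) = exp(-exp(-2c)) (assumed reading)."""
    return math.exp(-math.exp(-2.0 * c))
```

Plotting the probability of Hamiltonicity against `rescaled_parameter(n, m)` rather than against $m$ is what "opens up" the window: the curve `limit_probability` no longer depends on $n$.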

It is important to note that, since an annealed approximation yields an expression for the order parameter (in our case the satisfaction probability) that will usually display a phase transition as well, a rescaled parameter can also be defined for the mean-field version of the problem. 

The definition of the rescaled parameter allows a precise formulation of the intuition that an annealed approximation becomes exact in the limit $d \to \infty$. Let $P_d$ be a class of satisfiability problems indexed by a dimensionality parameter $d$, let $F_d$ be the rescaled satisfaction probability graph of $P_d$, and let $F_{ann,d}$ be the rescaled graph corresponding to the annealed approximation. Kirkpatrick and Selman observe experimentally that as $d \to \infty$, the function sequences $F_d$, $F_{ann,d}$ converge pointwise to a common limit $F_\infty$. 

Example 8. We present in detail the experimental results of Kirkpatrick and Selman. They define an (approximate) rescaled parameter for $k$-SAT 

$$y_k = n^{1/\nu_k} \cdot \frac{c - c_k}{c_k},$$ 

where $c = m/n$, $c_k$ is the critical threshold for $k$-SAT, and $\nu_k$ is the scaling width coefficient. Also, define the "annealed rescaled parameter" 

$$y_{\infty,k} = n \cdot \frac{c - c_k}{c_k}.$$ 

The rescaled limit probability graphs (and, see below, the rescaled versions of the mean-field versions) seem to converge (see Fig. 4 in that paper) to the "annealed limit" 

$$f_\infty(y) = e^{-2^{-y}}.$$ 

Definition 4.1. In this paper dimension-dependent behavior refers to the above-mentioned type of phenomenon: convergence of the "rescaled" probability functions (and their annealed counterparts) to some common annealed limit. 

Observation 1. 

It is important to note that dimension-dependent behavior is at the same time more and less demanding than critical behavior. 

It is more demanding since it requires that the annealed approximation be exact throughout the (rescaled version of the) critical region. In contrast, critical exponents only provide a qualitative picture of this region, rather than uniquely determining the limit probability throughout it; for instance the width of the scaling window $\nu$ is equal to $2\beta + \gamma$, where $\beta$ is the so-called order-parameter exponent, which characterizes the asymptotic behavior of the order parameter close to the transition point, and $\gamma$ is called the susceptibility exponent (see e.g. [6]). 

It is less demanding since it does not assume the existence of critical exponents; therefore it makes sense for problems having coarse thresholds, including those that have no singular/critical points. 

Why should we expect critical behavior and the above form for the annealed limit? The intuition is very simple: the major difficulty in computing the probability that a random $k$-SAT formula is satisfiable is the fact that, for two assignments $A$ and $B$, the events "$A \models \Phi$" and "$B \models \Phi$" are not generally independent, because there exist clauses of length $k$ that are falsified by both $A$ and $B$. On the other hand, qualitatively, as $k \to \infty$ clausal constraints become progressively "looser", so that in the limit we can neglect such correlations. 

As to the exact expression for $f_\infty(y)$, for a $k$-CNF formula the mean-field approximation implies 

$$\Pr[\Phi \notin SAT] \approx (1 - \gamma_k^{cn})^{2^n} \approx e^{-2^n \cdot \gamma_k^{cn}}.$$ 

But since $c_k$ is specified (in the mean-field approximation) by $E[\#SAT] = 1$, i.e. $2^n \cdot \gamma_k^{c_k n} = 1$, or $1 + c_k \log_2 \gamma_k = 0$, this implies that as $k \to \infty$ 

$$\Pr[\Phi \notin SAT] \approx e^{-2^{n[1 - c/c_k]}} = f_\infty(y_{\infty,k}).$$ 

In other words, when plotted against the annealed order parameters $y_{\infty,k}$, the rescaled probability graphs (and their annealed counterparts) converge pointwise to the graph of $f_\infty$. 

5. Does critical behavior really exist? The intuitive argument sketched in the preceding paragraph seems to provide a beautiful explanation of the experimental results from [20]. That this intuition is, however, problematic has been shown by Wilson [?]. First note that if the previous argument were true, we would have $\nu_k = 1$ for any large enough $k$, since this is the width of the scaling window that the mean-field versions of $k$-SAT predict. On the other hand, Wilson presented a simple argument that implies that $\nu_k \ge 2$. Hence the above explanation is not rigorously valid. 

We stress that Wilson's observation does not rule out the existence of critical behavior: we, in fact, believe that the qualitative intuition that motivated [20], that versions of $k$-SAT become more and more "similar" as $k$ goes to infinity, is correct. It is the notion of annealed approximation that needs to be changed. And, certainly, his results do not rule out the possibility that the rescaled limit probabilities converge, as $k \to \infty$, to a suitably defined limit. Obtaining a rigorous example where this holds, one that identifies a suitable "annealed approximation that becomes exact" and also provides an explanation for this convergence, could hopefully offer insights on how to address this problem for random $k$-SAT as well. This is what our theorems in the next section provide. 

6. Our results. A Horn clause is a disjunction of literals containing at most one positive literal. It will be called positive if it contains a positive literal and negative otherwise. A Horn formula is a conjunction of Horn clauses. Horn satisfiability (denoted by HORN-SAT) is the problem of deciding whether a given Horn formula has a satisfying assignment. 
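
Horn satisfiability can be decided by repeatedly propagating positive unit clauses. The following is a generic sketch of that idea, not necessarily identical in its details (for instance in what it counts as one iteration) to the algorithm PUR analyzed in this paper. The clause encoding as a pair `(head, body)` and the function name are assumptions of this illustration.

```python
from collections import deque

def positive_unit_resolution(clauses):
    """Decide Horn satisfiability by propagating positive unit clauses.

    Each clause is a pair (head, body): head is the unique positive
    variable, or None for a negative clause; body lists the negated
    variables. Returns (satisfiable, forced_true, iterations), where
    iterations counts the positive units that were propagated.
    """
    remaining = [(head, set(body)) for head, body in clauses]
    if any(head is None and not body for head, body in remaining):
        return False, set(), 0           # the empty clause is present
    queue = deque(head for head, body in remaining
                  if head is not None and not body)
    forced_true = set()
    iterations = 0
    while queue:
        x = queue.popleft()
        if x in forced_true:
            continue
        forced_true.add(x)
        iterations += 1
        for head, body in remaining:
            if x in body:
                body.discard(x)          # resolve the clause against unit x
                if not body:
                    if head is None:     # empty clause derived
                        return False, forced_true, iterations
                    queue.append(head)   # new positive unit clause
    return True, forced_true, iterations
```

If no contradiction is derived, setting exactly the variables in `forced_true` to true (and all others to false) satisfies the formula.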

In this section we prove a result that displays dimension-dependent behavior for (at most) $k$-Horn satisfiability, the natural version of Horn satisfiability parameterized by the maximum clause length. This problem is also of practical interest in Artificial Intelligence, mainly in connection with theory approximation [18]. The results can be summarized as follows: 

1. For an unbounded $k = k(n)$ the threshold phenomenon is essentially the one from the "uniform case" $k(n) = n$. In particular there exists a "rescaled" parameter that makes the graphs of the limit probabilities superimpose (Theorem 6.2). 

2. For any constant $k$ the threshold phenomenon is qualitatively described by a suitably chosen queuing model (Theorem 6.4). This yields a closed-form expression for the satisfaction probability when $k = 2$ (Theorem 6.3). This expression has a singularity (though $k = 2$ is likely the only case that does so). 

3. The rescaled limit probabilities from the cases when $k$ is a constant converge to the one from the "infinite" case, which can in turn be seen as the result of a mean-field approximation (thus the problem displays what we have called dimension-dependent behavior). 

4. Somewhat surprisingly, the explanation for this convergence (an intrinsic feature of the problem) is a threshold property for the number of iterations of PUR (a particular algorithm) on random satisfiable Horn formulas "in the critical range." 

5. In the case when $k = 2$ PUR displays an "easy-hard-easy" pattern for the average number of iterations on satisfiable instances, peaked at the point where the limit probability has a singularity (Theorem 6.6). 



Note, however, the important difference between random $k$-SAT and random at-most-$k$-HORN-SAT: for every $k \ge 2$, $k$-SAT has a sharp threshold [?]. All versions of HORN-SAT have coarse thresholds. 

Definition 6.1. Let $k = k(n) : \mathbb{N} \to \mathbb{N}$ be monotonically increasing, $1 \le k(n) \le n$. We define the following random model $\Omega(k, n, m)$: a formula $\Phi$ on $n$ variables is obtained by selecting (uniformly at random and with repetition) $m$ clauses from the set of all (non-empty) Horn clauses in the given variables of length at most $k(n)$. 
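
A sampler for this model can be sketched as follows, under the assumption that a clause of length $j$ uses $j$ distinct variables. There are then $\binom{n}{j}$ negative and $j \binom{n}{j}$ positive Horn clauses of length $j$, so a uniform clause has length $j$ with probability proportional to $(j+1)\binom{n}{j}$ and, given its length, is negative with probability $1/(j+1)$. The function names and the `(head, body)` encoding are hypothetical.

```python
import math
import random

def random_horn_clause(n, k, rng):
    """Uniform non-empty Horn clause of length at most k over x_1..x_n.

    Returns (head, body): head is the positive variable or None,
    body is the frozenset of negated variables.
    """
    lengths = list(range(1, k + 1))
    # (j + 1) * C(n, j) Horn clauses have length exactly j
    weights = [(j + 1) * math.comb(n, j) for j in lengths]
    j = rng.choices(lengths, weights=weights)[0]
    variables = rng.sample(range(1, n + 1), j)   # random order, distinct
    if rng.randrange(j + 1) == 0:                # negative: prob 1/(j+1)
        return None, frozenset(variables)
    return variables[0], frozenset(variables[1:])

def random_horn_formula(n, k, m, seed=None):
    """m independent uniform clauses, drawn with repetition."""
    rng = random.Random(seed)
    return [random_horn_clause(n, k, rng) for _ in range(m)]
```

The weights are exact integers but are converted to floating point inside `random.choices`, so this sketch is only suitable for moderate values of $n$ and $k$.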

The following are our results (whose proofs are only sketched): 

Theorem 6.2. If k[n) —> oo, c > 0, H^n) is the number of Horn clauses on n 
variables having length at most k{n), and m{n) — c ■ '''"' then 

(6.1)    $p_\infty(c) := \lim_{n \to \infty} \Pr_{\Phi \in \Omega(k(n), n, m)}(\Phi \in \text{HORN-SAT}) = 1 - F_1(e^{-c})$. 



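Theorem 6.3. If $c > 0$, and $F_2 : (0, 1) \to (1, \infty)$, $F_2(x) = \ln x/(x - 1)$, then 

(6.2)    $p_2(c) := \lim_{n \to \infty} \Pr_{\Phi \in \Omega(2, n, cn)}(\Phi \in \text{HORN-SAT}) = \begin{cases} 1, & \text{if } c \le 3/2, \\ F_2^{-1}(2c/3), & \text{otherwise.} \end{cases}$ 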
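
The closed form for $k = 2$ can be evaluated numerically by inverting $F_2$ with bisection. The sketch below rests on an assumption that should be checked against the original statement: it reads the theorem as $p_2(c) = 1$ for $c \le 3/2$ and $p_2(c) = F_2^{-1}(2c/3)$ otherwise, i.e. it takes the constant multiplying $c$ to be $2/3$. The names `F2`, `F2_inverse`, `p2` and `LAMBDA_2` are hypothetical.

```python
import math

LAMBDA_2 = 2.0 / 3.0   # assumed constant multiplying c for k = 2

def F2(x):
    """F_2(x) = ln x / (x - 1) on (0, 1); decreasing from +inf to 1."""
    return math.log(x) / (x - 1.0)

def F2_inverse(y):
    """Solve F_2(x) = y for x in (0, 1) by bisection; requires y > 1."""
    if y <= 1.0:
        raise ValueError("F_2 only takes values in (1, infinity)")
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:   # interval collapsed to machine precision
            break
        if F2(mid) > y:              # F_2 is decreasing: root lies to the right
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def p2(c):
    """Limit satisfaction probability for k = 2 under the assumed reading."""
    mu = LAMBDA_2 * c
    return 1.0 if mu <= 1.0 else F2_inverse(mu)
```

Note that $x = F_2^{-1}(\mu)$ is equivalent to $x = e^{\mu (x - 1)}$, the fixed-point equation for the extinction probability of a branching process with Poisson($\mu$) offspring, which is in line with the queuing-chain description of PUR.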



More generally, define A^ = -^ and 5j = (^) + (*)+... + ('.) (with the usual 
convention (*) = for i < j). Then 
Fig. 6.1. Rescaled threshold functions (horizontal axis: rescaled order parameter) 

Theorem 6.4. The limit probability $p_k(c) := \lim_{n \to \infty} \Pr_{\Phi \in \Omega(k, n, c \cdot n^{k-1})}(\Phi \in \text{HORN-SAT})$ is equal to the probability that the following Markov chain ever hits state zero: 



(6.3) 



Qo = l, 



-i = Q,-l + Poic-\k-Sl+_\), 



To get a better intuition on the threshold phenomenon, as displayed by Theorems 6.2, 6.3 and 6.4, we have plotted (in Fig. 6.1) the limit probability functions $p_2(\cdot)$, $p_3(\cdot)$, $p_\infty(\cdot)$, 
against the "rescaled" parameter (inspired by Theorem |5.2|) c — jp'" . This rescaling has the 



pleasant property that it simplifies the factor $\lambda_k$ from the right-hand side of (6.3), in particular mapping the critical point in Theorem 6.3 to $c = 1$. The graphs of $p_2$ (continuous) and $p_\infty$ (dashed) are obtained from their formulas in the previous results, while $p_3$ (dotted) is obtained via simulations. The figure makes apparent that the graphs of $p_2, p_3, \ldots$ converge to the graph of $p_\infty$. This statement can be proved rigorously: 

Theorem 6.5. For every $c > 0$, $\lim_{n \to \infty} p_n(c) = p_\infty(c)$. 

As a bonus our analysis yields the following result: 



Theorem 6.6. Let $q$ be the limit of the expected number of iterations of PUR on a random formula $\Phi \in \Omega(2, n, cn)$, conditional on $\Phi$ being satisfiable. Then 



(6.4) 



1— P2A2C ' -' '2' 

00, Otherwise. 



This theorem suggests (see Fig. 6.2) and explains the "easy-hard-easy" pattern for the average running time of PUR in this case. Experiments we performed confirm this prediction. 

Fig. 6.2. The "easy-hard-easy" pattern. 



7. Preliminaries. Throughout this paper we use "with high probability" (w.h.p.) as a substitute for "with probability $1 - o(1)$". We denote (sometimes abusing notation) by $B(n, p)$ ($Po(\lambda)$) a random variable having a binomial (Poisson) distribution with the corresponding parameter(s), and by $a \dot{-} b$ the value $\max\{a - b, 0\}$. We will use the following version of the Chernoff bound 

Theorem 7.1. 7/0 < 6* < l/Athen Pr[\B{n,p) - np\ > 0np] < e^"^^. 

as well as the related inequality from [g]:

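Proposition 7.2. Let $P$ have Poisson distribution with mean $\mu$. For $\epsilon > 0$,

$\Pr[P \le \mu \cdot (1 - \epsilon)] \le e^{-\epsilon^2 \mu/2}$,

$\Pr[P \ge \mu \cdot (1 + \epsilon)] \le \left[e^{\epsilon}(1+\epsilon)^{-(1+\epsilon)}\right]^{\mu}$.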
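As a sanity check on the two Poisson tail bounds, the following minimal sketch compares exact Poisson tails with the bounds in their standard form, $e^{-\epsilon^2\mu/2}$ and $[e^{\epsilon}(1+\epsilon)^{-(1+\epsilon)}]^{\mu}$. The function names and the truncation cutoff are illustrative choices, not part of the paper.

```python
import math

def poisson_pmf(mu, j):
    # Pr[P = j] for P ~ Poisson(mu), evaluated in log space for stability.
    return math.exp(-mu + j * math.log(mu) - math.lgamma(j + 1))

def lower_tail(mu, x):
    # Pr[P <= x]
    return sum(poisson_pmf(mu, j) for j in range(0, math.floor(x) + 1))

def upper_tail(mu, x, cutoff=2000):
    # Pr[P >= x], truncated at a cutoff far beyond the mean (assumption).
    return sum(poisson_pmf(mu, j) for j in range(math.ceil(x), cutoff))

def lower_tail_bound(mu, eps):
    # Bound on Pr[P <= mu * (1 - eps)]
    return math.exp(-eps * eps * mu / 2)

def upper_tail_bound(mu, eps):
    # Bound on Pr[P >= mu * (1 + eps)]
    return (math.exp(eps) * (1 + eps) ** (-(1 + eps))) ** mu

if __name__ == "__main__":
    for mu in (5.0, 20.0, 50.0):
        for eps in (0.1, 0.5, 0.9):
            print(mu, eps,
                  lower_tail(mu, mu * (1 - eps)), lower_tail_bound(mu, eps),
                  upper_tail(mu, mu * (1 + eps)), upper_tail_bound(mu, eps))
```

Each printed exact tail should not exceed the bound next to it.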



We also use the following inequality:

Proposition 7.3. Let $k \in \mathbb{N}$ and $p \in [0,1]$. Then for every $n \ge k$

(7.1)  $1 - \sum_{i=0}^{k-1} \binom{n}{i} p^i (1-p)^{n-i} \le \binom{n}{k} p^k$.

Proof: Define $f : [0,1] \to \mathbb{R}$, $f(p) = 1 - \sum_{i=0}^{k-1} \binom{n}{i} p^i (1-p)^{n-i} - \binom{n}{k} p^k$. It is easy to see that $f'(p) = n \binom{n-1}{k-1} p^{k-1} [(1-p)^{n-k} - 1] \le 0$, therefore $f$ is monotonically decreasing, and $f(0) = 0$. □
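The inequality of Proposition 7.3 says that the upper tail $\Pr[B(n,p) \ge k]$ is at most $\binom{n}{k}p^k$. A small numerical sketch, with hypothetical helper names, checks it on a grid of parameters:

```python
import math

def binomial_upper_tail(n, p, k):
    # Pr[B(n, p) >= k], i.e. 1 minus the sum of the first k binomial terms.
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def tail_bound(n, p, k):
    # Right-hand side of (7.1).
    return math.comb(n, k) * p**k

if __name__ == "__main__":
    for n, k, p in [(20, 3, 0.05), (60, 5, 0.02), (60, 10, 0.2)]:
        print(n, k, p, binomial_upper_tail(n, p, k), tail_bound(n, p, k))
```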



We will also employ couplings of Markov chains (see \vM) to assert stochastic domina- 
tion. The following is the definition of the type of coupling we employ in this paper: 

Definition 7.4. Let $(X_t)_t$ and $(Y_t)_t$ be two Markov chains on $\mathbb{Z}$. A coupling of $X$ and $Y$ such that $X_t \le Y_t$ is a Markov chain $Z = (Z_{t,1}, Z_{t,2})$ such that:

• $Z_{t,1}$ is distributed like $X_t$ given $X_0$.

• $Z_{t,2}$ is distributed like $Y_t$ given $Y_0$.

• for every $t \ge 0$, $Z_{t,1} \le Z_{t,2}$.

We use such couplings to bound the probability that a Markov chain $Y_t$ ever decreases below a certain value $a$ by coupling it with a chain $X_t$ such that $X_t \le Y_t$ and using the estimate $\Pr[\exists t : Y_t < a] \le \Pr[\exists t : X_t < a]$ (that follows from the coupling). The couplings we construct employ the following ideas:

• Suppose the recurrences describing $\Delta X_t$ and $\Delta Y_t$ are identical, except for one term, which is $B(m_1, \tau)$ in $X_t$ and $B(m_2, \tau)$ in $Y_t$, where $m_1 \le m_2$ are positive integers and $\tau \in (0,1)$. Obtain a coupling by identifying $B(m_1, \tau)$ with the outcome of the first $m_1$ Bernoulli experiments in $B(m_2, \tau)$.

• Suppose now that $\Delta X_t$ and $\Delta Y_t$ differ by exactly one term, which is $B(m, p)$ in $\Delta X_t$ and $B(m, q)$ in $\Delta Y_t$, $p < q$. Let $A_i$ and $B_i$, $i = 1, \ldots, m$, be independent 0/1 experiments with success probabilities $p$ and $\frac{q-p}{1-p}$ respectively (so that at least one of $A_i$, $B_i$ succeeds with probability exactly $q$). Define the pair $(Z_{t,1}, Z_{t,2})$ so that

1. $Z_{t,1}$ is the number of times $A_i$ succeeds.

2. $Z_{t,2}$ is the number of times at least one of $A_i$ and $B_i$ succeeds.
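The two coupling ideas above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the success probability $(q-p)/(1-p)$ for the auxiliary experiments is the value that makes "at least one succeeds" have probability $q$.

```python
import random

def couple_by_size(m1, m2, tau, rng):
    # Coupling of B(m1, tau) and B(m2, tau) for m1 <= m2: both counts are read
    # off the same sequence of Bernoulli(tau) trials.
    trials = [rng.random() < tau for _ in range(m2)]
    return sum(trials[:m1]), sum(trials)

def couple_by_probability(m, p, q, rng):
    # Coupling of B(m, p) and B(m, q) for p < q < 1.
    r = (q - p) / (1 - p)          # success probability of the auxiliary B_i
    z1 = z2 = 0
    for _ in range(m):
        a = rng.random() < p       # experiment A_i
        b = rng.random() < r       # experiment B_i, independent of A_i
        z1 += a                    # counts successes of A_i
        z2 += (a or b)             # counts "A_i or B_i succeeds"
    return z1, z2

if __name__ == "__main__":
    rng = random.Random(0)
    print(couple_by_size(10, 25, 0.3, rng))
    print(couple_by_probability(50, 0.2, 0.5, rng))
```

In both functions the first returned value never exceeds the second, which is the pointwise domination required by Definition 7.4.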

We measure the distance between two probability distributions P and Q by the total 



variation distance, denoted by dTv{P, Q), and recall the following results, (see |32| and \^, 
page 2 and Remark 1.4): 

Lemma 7.5. If n,p, X, jj, > then dTv{B{n,p),Po{np)) < mm{np^,-^} and 

$d_{TV}(Po(\lambda), Po(\mu)) \le |\mu - \lambda|$.
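For intuition, the total variation distances in Lemma 7.5 can be evaluated numerically. The sketch below assumes the normalization $d_{TV}(P,Q) = \frac{1}{2}\sum_j |P_j - Q_j|$ (the paper's normalization is not visible here) and checks only the $np^2$ term of the binomial bound and the Poisson-Poisson bound; all names are illustrative.

```python
import math

def binom_pmf(n, p, j):
    # Pr[B(n, p) = j]; zero outside {0, ..., n}.
    if j > n:
        return 0.0
    return math.comb(n, j) * p**j * (1 - p)**(n - j)

def poisson_pmf(mu, j):
    # Pr[Po(mu) = j], evaluated in log space.
    return math.exp(-mu + j * math.log(mu) - math.lgamma(j + 1))

def tv_distance(f, g, support):
    # Half the L1 distance between two pmfs on a common (truncated) support.
    return 0.5 * sum(abs(f(j) - g(j)) for j in support)

if __name__ == "__main__":
    n, p = 100, 0.05
    support = range(0, n + 200)
    d1 = tv_distance(lambda j: binom_pmf(n, p, j),
                     lambda j: poisson_pmf(n * p, j), support)
    d2 = tv_distance(lambda j: poisson_pmf(2.0, j),
                     lambda j: poisson_pmf(2.3, j), support)
    print(d1, n * p * p)   # binomial vs. Poisson, compared with n p^2
    print(d2, 0.3)         # Poisson vs. Poisson, compared with |mu - lambda|
```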

We will also need the following simple lemma: 

Lemma 7.6. Let $c$ be a fixed positive integer. For every $t \in \mathbb{N}$ let $\xi_t, \eta_t$ be two probability distributions. Define the Markov chains $(X_t)_t$ and $(Y_t)_t$ by the recurrences

(7.2)  $X_{t+1} = X_t \dot{-} c + \xi_t$,  $Y_{t+1} = Y_t \dot{-} c + \eta_t$.

Then, for every $t \ge 0$, $d_{TV}(X_t, Y_t) \le d_{TV}(X_0, Y_0) + \sum_{i=0}^{t-1} d_{TV}(\xi_i, \eta_i)$.

Proof.

The following result gives a more convenient inequality that immediately implies Lemma 7.6.
Lemma 7.7. Let $c$ be a fixed positive integer. Let $X, Y, \xi, \eta$ be random variables with nonnegative integer values. Define the random variables $Z$ and $T$ by

(7.3)  $Z = X \dot{-} c + \xi$,  $T = Y \dot{-} c + \eta$.

Then $d_{TV}(Z, T) \le d_{TV}(X, Y) + d_{TV}(\xi, \eta)$.

Proof. 

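To prove this result, we will denote (for the "generic" r.v. $A$) by $A_i$ the probability that $A$ takes value $i$. We also employ the following simple inequality, valid for $a, b, c, d \ge 0$: $|ad - bc| \le a|d - c| + |a - b|c$.

For every $a \ge 0$ we have:

$Z_a = \sum_{i=0}^{c} X_i \xi_a + \sum_{i=c+1}^{c+a} X_i \xi_{c+a-i}$,

$T_a = \sum_{i=0}^{c} Y_i \eta_a + \sum_{i=c+1}^{c+a} Y_i \eta_{c+a-i}$.

Applying the above-mentioned inequality and summing we get:

$d_{TV}(Z,T) \le \sum_{i=0}^{c} \sum_{a=0}^{\infty} X_i |\xi_a - \eta_a| + \sum_{i=0}^{c} \sum_{a=0}^{\infty} |X_i - Y_i| \eta_a + \sum_{a=0}^{\infty} \sum_{i=c+1}^{c+a} X_i |\xi_{c+a-i} - \eta_{c+a-i}| + \sum_{a=0}^{\infty} \sum_{i=c+1}^{c+a} |X_i - Y_i| \eta_{c+a-i}$.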

Let A,B,C,D be the four terms of the sum. By simple algebraic manipulations we obtain: 






and the result follows. D 
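The inequality of Lemma 7.7 is easy to test numerically on small distributions. The following sketch assumes that $X$ and $\xi$ (respectively $Y$ and $\eta$) are independent, and uses the normalization $d_{TV} = \frac{1}{2}\sum_a |\cdot|$; both are assumptions of this illustration, as are the function names.

```python
import random

def shifted_sum(x, xi, c):
    # pmf of max(X - c, 0) + xi for independent X and xi, given as lists of
    # probabilities indexed by value.
    out = [0.0] * (len(x) + len(xi))
    for i, px in enumerate(x):
        base = max(i - c, 0)
        for j, pj in enumerate(xi):
            out[base + j] += px * pj
    return out

def tv(f, g):
    # Half the L1 distance between two pmfs given as lists.
    n = max(len(f), len(g))
    f = f + [0.0] * (n - len(f))
    g = g + [0.0] * (n - len(g))
    return 0.5 * sum(abs(a - b) for a, b in zip(f, g))

def random_pmf(size, rng):
    w = [rng.random() for _ in range(size)]
    s = sum(w)
    return [v / s for v in w]

if __name__ == "__main__":
    rng = random.Random(1)
    x, y = random_pmf(6, rng), random_pmf(6, rng)
    xi, eta = random_pmf(4, rng), random_pmf(4, rng)
    lhs = tv(shifted_sum(x, xi, 2), shifted_sum(y, eta, 2))
    print(lhs, tv(x, y) + tv(xi, eta))
```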

Finally, we need the following trivial occupancy property: 

Lemma 7.8. Let a white balls and b black balls be thrown uniformly at random in n 
bins. 

L if r — niax(a,6) — o{n}''^) then the probability that there is a bin that contains 

both white and black balls is at most ^^— = oil). 
2. if s = iiiin(a, 6) = Lu{n^''^) then the probability that there is a bin that contains 
both white and black balls is 1 — o{l/poly). 

Proof. The first part is easy: the probability that two balls (of any color) end up in the same bin is at most $\binom{a+b}{2} \cdot \frac{1}{n}$. For the second part, let $A$ be the event that no two balls of different colors end up in the same bin, and let $B$ be the event that at least $\sqrt{n}$ bins contain white balls. We have:

$\Pr[A] \le \Pr[A \mid B] + \Pr[\bar{B}]$.

But



•■'i^^l "^)-(7h'"^ ""'"'■" -''<;5;>' 



and

$\Pr[A \mid B] \le \left(1 - \frac{\sqrt{n}}{n}\right)^{b} \le e^{-b/\sqrt{n}} = o\left(\frac{1}{poly}\right)$.



D 
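The two regimes of Lemma 7.8 can be illustrated by a small Monte Carlo experiment. This is only a sketch: the function name, the seed and the parameter values are hypothetical choices meant to sit far inside each regime.

```python
import random

def has_mixed_bin(a, b, n, rng):
    # Throw a white and b black balls uniformly into n bins and report whether
    # some bin receives balls of both colors.
    white_bins = {rng.randrange(n) for _ in range(a)}
    return any(rng.randrange(n) in white_bins for _ in range(b))

if __name__ == "__main__":
    rng = random.Random(1)
    few = sum(has_mixed_bin(5, 5, 10**9, rng) for _ in range(20))
    many = sum(has_mixed_bin(2000, 2000, 10**4, rng) for _ in range(20))
    print(few, many)   # number of trials (out of 20) with a mixed bin
```

With very few balls a mixed bin is essentially never observed, while with many more than $\sqrt{n}$ balls of each color it is observed in every trial.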



The algorithm PUR is displayed in Fig. 7.1. We regard PUR as working in stages, indexed by the number of variables still left unassigned; thus, the stage number decreases as PUR moves on. We say that formula $\Phi$ survives Stage $t$ if PUR on input $\Phi$ does not halt at Stage $t$ or earlier. Let $\Phi_i$ be the formula at the beginning of stage $i$, and let $N_i$ denote the number of its clauses. We will also denote by $P_{i,t}$ ($N_{i,t}$) the number of clauses of $\Phi_t$ of size $i$ and containing one (no) positive literal. Define $\Phi^P_{i,t}$ ($\Phi^N_{i,t}$) to be the subformula of $\Phi_t$ containing the clauses counted by $P_{i,t}$ ($N_{i,t}$).

The following lemmas were proved in [17], in the context of analyzing the behavior of PUR on $\Phi \in \Omega(n, n, m)$, $m = c \cdot 2^n$.

Lemma 7.9.

1. Suppose PUR does not halt before stage $t$. Then, conditional on $N_t$, the clauses of $\Phi_t$ are random and independent.

2. Suppose now that we condition on $V_t = (N_{1,t}, N_{2,t}, P_{1,t}, P_{2,t})$ and on the fact that $\Phi$ survives Stage $t$ as well. Then we have

(7.4)  $N_{t-1} = N_t - \Delta_{1,P}(t) - \Delta_{2,P}(t)$,

where

• $\Delta_{1,P}(t)$, the number of positive clauses that are satisfied at stage $t$, has the distribution $1 + B(P_{1,t} - 1, \frac{1}{t})$.

• $\Delta_{2,P}(t)$, the number of positive non-unit clauses that are satisfied at stage $t$, has the binomial distribution $B(P_{2,t}, \frac{1}{t})$.

Lemma 7.10. For every c > and every t,n~Cy/n^ < t < n, the conditional probability 
that the inequality 



(7.5) A^„ - (n - t) 



^ ^ 2(iV„ - 1) 



< Nj < Nn 



t 

holds for all t < j < n, in the event that PUR reaches stage t, is 1 — o(l). 
Program PUR($\Phi$):

if ($\Phi$ contains no positive literal as a clause)
    then return TRUE
else
    choose such a positive unit clause $x$
    if ($\Phi$ contains $\bar{x}$ as a clause)
        then return FALSE
    else
        let $\Phi'$ be the formula obtained by setting $x$ to 1
        return PUR($\Phi'$)

Fig. 7.1. Algorithm PUR
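A runnable sketch of the procedure in the figure may help fix ideas. The clause representation (a pair of an optional positive variable and a set of negated variables), the iterative formulation, and the rule of always picking the smallest available positive unit clause are assumptions of this illustration; the input is assumed to contain no empty clause.

```python
from typing import FrozenSet, List, Optional, Tuple

# A Horn clause is (head, body): `head` is the single positive variable
# (or None), `body` is the set of variables that occur negated.
Clause = Tuple[Optional[int], FrozenSet[int]]

def pur(clauses: List[Clause]) -> Tuple[bool, int]:
    """Positive unit resolution; returns (accepted, number of variables set to 1)."""
    current = list(clauses)
    steps = 0
    while True:
        # Positive unit clauses: a head and an empty body.
        units = sorted(h for h, body in current if h is not None and not body)
        if not units:
            return True, steps          # no positive unit clause: accept
        x = units[0]
        # Reject if the complementary unit clause (not x) is present.
        if any(h is None and body == frozenset({x}) for h, body in current):
            return False, steps
        # Set x to 1: drop satisfied clauses, delete the literal (not x) elsewhere.
        current = [(h, body - {x}) for h, body in current if h != x]
        steps += 1

if __name__ == "__main__":
    sat = [(1, frozenset()), (2, frozenset({1}))]
    unsat = sat + [(None, frozenset({1, 2}))]
    print(pur(sat), pur(unsat))
```

The step counter corresponds to the number of stages the formula goes through before PUR halts, which is the quantity tracked by the stage index in the analysis.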

Lemma 7.11. Let Xn G [0,n\ be the r.v. denoting the number of iterations of P\]K 
on a random satisfiable /ormM/o $ £ J7(n,c • 2"). Then Xn converges in distribution to 
a distribution p on [0, n] having support on the nonnegative integers, p — {pk)k>o, Pk = 
Proh[p = k], given by 



-2"c 



pfc = r^^^ ■ nV - e"''^)- 



l-f(e-) ^^^ 



8. The proof of Theorem |6.2| . Let ci < C2 < C3 be arbitrary constants. Consider three 
random formulas $1 e ri(r7,, k(n), ci • —^), $2 G rj(n, n, C2 -2") and $3 G r2(n,k(n), €3- 
— ^^^), and let $' be the subformula of $2 consisting of the clauses of size at most k{n). 
By the Chernoff bound, with high probability, m! , the number of clauses of $', is in the 
interval [ci • ^^^,03 • ^^^]. When n ^ 00 the probabiUty that $2 e HORN-SAT tends 
tol-i^i(e-'=4. 

From Lemma 7.11 we infer the following easy consequence:

Claim 1. The probability that PUR accepts $\Phi_2$ after stage $n - k(n) + 1$ is $o(1)$.

Since in the first $k(n) - 1$ stages of PUR only the clauses of $\Phi'$ can influence the algorithm's acceptance/rejection of $\Phi_2$ (because PUR accepts/rejects at Stage $i$ based only on the unit clauses, and each non-simplified clause loses at most one literal at each phase),

$|\Pr[\Phi_2 \in \text{HORN-SAT}] - \Pr[\Phi' \in \text{HORN-SAT}]| = o(1)$.

By the monotonicity of SAT and the randomness of $1 , $2 , 'f we have 

Pr[$i e HORN-SAT] - o(l) < Pr[$2 e HORN-SAT] < Pr[$3 e HORN-SAT] + o(l). 

Taking limits it follows that 

IW„^^ Pr$eo(„,fc(«),ciff,,„,/„)[* e HORN-SAT] < 1 - F(e-^^) < 
lini„^ooPr$eo(„,fc(n),c3i/.,„,/n)['J' e HORN-SAT]. 
Since $c_1, c_2, c_3$ were chosen arbitrarily, by choosing $c_1 = c$, $c_2 = c + \epsilon$, and $c_2 = c - \epsilon$, $c_3 = c$, respectively, we infer that

1 - fi(e-(-^)) < lim_.oPr*eo(n,M«),cH.,„,/„)[* ^ HORN-SAT] < 
n^„_o,Pr$en(„,fc(n),cffM.)/«)[* ^ HORN-SAT] < 1 - Fi(e-(^+^)). 
As e is arbitrary, we get the desired result. □ 



Observation 2. One point about the previous proof that is intuitively clear, but gets 
somewhat obscured by the technical details of the proof is that if ^2 G ri(n, n, C2 • 2") then 
<I> behaves "for every practical purpose" as if it were a uniform formula in U,{n, k(n), C2 • 
— ^i2i). We will use a similar intuition in the proof of Proposition 



9. The uniformity lemma. The following lemma is the analog of Lemma 7.9 for the case $k = 2$, and the basis for our analysis of this case:

Lemma 9.1. Suppose that $\Phi$ survives up to stage $t$. Then, conditional on $(P_{1,t}, N_{1,t}, P_{2,t}, N_{2,t})$, the clauses in $\Phi^P_{1,t}, \Phi^N_{1,t}, \Phi^P_{2,t}, \Phi^N_{2,t}$ are chosen uniformly at random and are independent. Also, conditional on the fact that $\Phi$ survives stage $t$ as well, the following recurrences hold:

(9.1)
$P_{1,t-1} = P_{1,t} - 1 - \Delta^P_{01,t} + \Delta^P_{12,t}$,
$N_{1,t-1} = N_{1,t} + \Delta^N_{12,t}$,
$P_{2,t-1} = P_{2,t} - \Delta^P_{12,t} - \Delta^P_{02,t}$,
$N_{2,t-1} = N_{2,t} - \Delta^N_{12,t}$,

where (in distribution)

(9.2)
$\Delta^P_{01,t} = B(P_{1,t} - 1, 1/t)$,
$\Delta^P_{12,t} = B(P_{2,t}, 1/t)$,
$\Delta^P_{02,t} = B(P_{2,t} - \Delta^P_{12,t}, 1/t)$,
$\Delta^N_{12,t} = B(N_{2,t}, 2/t)$.



Proof. A formula will be represented by an $m \times 2$ table. The rows of the table correspond to clauses in the formula and the entries are its literals. They are gradually unveiled as the algorithm proceeds. We assume that when generating $\Phi$ we mark those clauses containing only one literal (so that we know their location, but not their content). We say that a row (or a clause) is "blocked" either if the clause is already satisfied or the clause has been turned into the empty clause. Suppose PUR arrives at stage $t$ on $\Phi$. Then in stages $i = n, n-1, \ldots, t+1$, $\Phi_i$ should contain a unit clause consisting of a positive literal but should not have contained complementary unit clauses of the same variable. To carry out the disclosure at stage $i$, let $x$ be the variable set to one in this stage. We assume that the formula unveils all occurrences of $x$ or $\bar{x}$ in $\Phi$. For each clause we perform the following:

1. if it contains $x$ we unveil all its literals and block;

2. otherwise we do nothing.

The clauses of $\Phi_t$ having size two correspond to the rows of $\Phi$ that contain no unveiled literal. The clauses of size one are either the clauses of size one in $\Phi$ that contain none of the chosen literals, or the clauses of size two that contain the negation of one chosen variable and another variable that is yet to be chosen. Given these observations the uniformity and independence follow from the way we construct $\Phi$.

To prove the recurrences, let $x$ be the variable set to one in stage $t$ (it exists since PUR does not halt at this stage). By uniformity and independence, each of the $P_{1,t} - 1$ positive unit clauses of $\Phi_t$, other than the chosen one, is equal to $x$ with probability $1/t$ (since there are $t$ variables left at this stage). On the other hand, the positive unit clauses of $\Phi_{t-1}$ that are not present in $\Phi_t$ can only come from clauses of size two of $\Phi_t$ that contain $\bar{x}$ and a positive literal (therefore counted by $P_{2,t}$). Uniformity and independence imply therefore that $\Delta^P_{12,t}$ has the distribution claimed in (9.2). The other relations can be justified similarly (noting that, since PUR does not reject at this stage, every negative unit clause of $\Phi_t$ is also present in $\Phi_{t-1}$).

It will be useful to consider the Markov chain (9.1) for all values of $t = n, \ldots, 0$ (even when the algorithm halts). To accomplish that, the "minus" signs in the first equation of (9.1) and the definition of $\Delta^P_{01,t}$ should be replaced by $\dot{-}$. We also need to specify the distribution of each component of the tuple $(P_{1,n}, N_{1,n}, P_{2,n}, N_{2,n})$. Let $A_n$ be a random variable having the binomial distribution $B\left(cn, \frac{2n}{2n + 3\binom{n}{2}}\right)$. It is easy to see that in distribution

(9.3)
$P_{1,n} = B(A_n, 1/2)$,
$N_{1,n} = A_n - P_{1,n}$,
$P_{2,n} = B(cn - A_n, 2/3)$,
$N_{2,n} = cn - A_n - P_{2,n}$.

□



10. Proof of Theorem 6.3. The main intuition for the proof is that in "most interesting stages" $\Delta^P_{01,t}$ is zero and $\Delta^P_{12,t}$ is approximately Poisson distributed. Therefore, $P_{1,t}$ qualitatively behaves like the Markov chain $(Q_t)_t$ defined by

(10.1)  $Q_0 = 1$,  $Q_{t+1} = Q_t - 1 + Po(\lambda)$,

where $\lambda = 2c/3$. This explains the closed form of the limit probability: a well-known result states that $p$, the probability that the queuing chain $Q_t$ reaches state 0, satisfies the equation $p = \Phi(p)$, where $\Phi(t) = e^{\lambda(t-1)}$ is the generating function of the arrival distribution $Po(\lambda)$.
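The fixed-point equation $p = e^{\lambda(p-1)}$ is easy to solve numerically. The sketch below assumes, as for extinction probabilities of branching processes, that the relevant solution is the smallest one in $[0,1]$, which is the limit of the iteration started at 0; the function names and tolerances are illustrative.

```python
import math

def hitting_probability(lam, tol=1e-13, max_iter=1_000_000):
    # Iterate p <- exp(lam * (p - 1)) from p = 0; the iterates increase
    # monotonically to the smallest fixed point in [0, 1].
    p = 0.0
    for _ in range(max_iter):
        nxt = math.exp(lam * (p - 1.0))
        if abs(nxt - p) < tol:
            return nxt
        p = nxt
    return p

def p2(c):
    # Limit satisfiability probability for k = 2, using lambda = 2c/3.
    return hitting_probability(2.0 * c / 3.0)

if __name__ == "__main__":
    for c in (0.75, 1.0, 2.0, 3.0, 6.0):
        print(c, p2(c))
```

For $\lambda \le 1$ the iteration converges to 1, and for $\lambda > 1$ the value returned satisfies $\ln p/(p-1) = \lambda$, in agreement with the closed form stated in terms of $F_2$. Convergence is slow near $\lambda = 1$, so values of $c$ close to $3/2$ need many iterations.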



We will define a suitable value $\omega_0$ such that:

1. With high probability PUR does not reject in any of stages $n, \ldots, n - \omega_0$.

2. PUR accepts "mostly before or at stage $n - \omega_0$" (i.e. the probability that PUR accepts after stage $n - \omega_0$, given that $\Phi$ survives this far, is $o(1)$).

3. With high probability, for every $t \in n, \ldots, n - \omega_0$, $\Delta^P_{01,t} = 0$.

4. At stages $n, \ldots, n - \omega_0$, $P_{1,t}$ is "very close" to $Q_t$, with respect to total variation distance.

This program can be accomplished as described if $c < 3/2$. To prove Property 4 we make use of Lemmas 7.5 and 7.8. Property 2 is proved only implicitly: in this case (see [15]) the probability that $Q_i = 0$ for some $i$ tends to one, and, in fact, by a technical result due to
Frieze and Suen (Lemma 3.1 in [ttO|), Pr[Qi — for some i > n ~ logn] is 1 — o(l). 
Let us now concentrate on the case when $c > 3/2$ (the case when $c = 3/2$ will follow by a monotonicity argument). In the previous argument we only used the fact that $c < 3/2$ when deriving the probability that $Q_t$ hits state 0, hence the arguments from above carry over, and the conclusion is that the probability that PUR accepts at one of the stages $n, \ldots, n - \omega_0$ differs by $o(1)$ from the probability that $Q_t = 0$ somewhere in this range. We now, however, have to consider the probability that PUR accepts at some stage later than $n - \omega_0$ and aim to prove that this probability is $o(1)$. It is conceptually simpler to divide the interval $[n - \omega_0, 0]$ into two subintervals, $[n - \omega_0, n - \omega_1]$ and its complement, such that w.h.p. $\Phi_{n-\omega_1}$ (if defined) contains two opposite unit clauses, therefore the probability that PUR accepts after stage $n - \omega_1$ is $o(1)$. In the range $[n - \omega_0, n - \omega_1]$ we would like to prove that "most of the time" $\Delta^P_{01,t}$ is zero and $P_{1,t}$ is "close" to $Q_t$, and to reduce the problem to the analysis of $Q_t$. Unfortunately there are two problems with this approach: although the probability that each individual $\Delta^P_{01,t} > 0$ is fairly small, to make $\Phi_{n-\omega_1}$ unsatisfiable w.h.p., $\omega_1$ has to be $\Omega(\sqrt{n})$. This implies that we cannot sum these probabilities over $[n - \omega_0, n - \omega_1]$ and expect the sum to be $o(1)$; a similar problem arises if we want to sum the upper bounds for $d_{TV}(\Delta^P_{12,t}, Po(\lambda))$.

Fortunately there is a way to circumvent this, avoiding the use of total variation distance altogether: although we cannot guarantee that w.h.p. each $\Delta^P_{01,t} = 0$, we can arrange that w.h.p. for every sequence of $p$ consecutive stages $t, t-1, \ldots, t-p+1$, $\Delta^P_{01,t} + \Delta^P_{01,t-1} + \ldots + \Delta^P_{01,t-p+1} \le 3$ (*). Intuitively, in any sequence of $p$ consecutive steps at most $p + 3$ clients leave the queue, and the number of those who arrive is the sum of $p$ approximately Poisson variables, thus approximately Poisson with parameter $p\lambda$. Choosing $p$ large enough so that $\lambda > 1 + \frac{3}{p}$ ensures that in any $p$ steps the average number of customers that arrive is strictly larger than the number of customers that are served in this time span. Therefore we will seek to approximate $P_{1,t}$ by a queuing chain with this property. Since the common starting value $P_{1,n-\omega_0}$ is "large," an elementary analysis of the queuing chain implies that the probability that it hits state 0 in the interval $[n - \omega_0, n - \omega_1]$ is exponentially small. So we obtain the desired result if this chain is constructed so that it is stochastically dominated by $P_{1,t}$.

10.1. The case c < 3/2. Define loq — n°'^. The following are the main steps of the 
proof in this case: 

Lemma 10.1. With probability \ — o{l/ poly) for every t ^ [n, . - . ,n/2\ we have 

A P A P AN < 1 0.1 
^12,t7 ^02.t; ^12,t — r, 



Proof. Use the coupling with $m_1 = P_{2,t}$ ($N_{2,t}$), $m_2 = cn$, $\tau = 1/t$, and apply the Chernoff bound to $B(cn, 1/t)$. □



Corollary 1. Consider lj < n/2. If for every t G [n, . . . , n/2], Af2 j, Afj.ti ^I2,t ^ 
Sn^-^ then, for all t(^ K . . . , n- w], Pi^t, TVi,*, |P2,t - P2,„|, lA^a^t - iV2.nI < {n-t)-n^-\ 



Lemma 10.2. If for all tc, [n, . . - ,n - uj], Pi^t,Ni,t,\P2.t - P2.nlW2,t ~ N2,n\ < 
{n — t) ■ n'^'^ holds then w-h.p. Aff. — Ofor every t G [n, . - . ,n — CiJo]- 






Proof. Pr[B(Pi,, - 1, 1) > 0] = 1 - Pr[S(Pi,t - 1, ]] 



0] = 1 - (1 - i)^M-i 



Pi, 



:i < n-o-9. 



Lemma 10.3. W.h.p., $|P_{2,n} - \frac{2}{3}cn|, |N_{2,n} - \frac{1}{3}cn| \le n^{0.6}$.

Proof. Directly from the Chernoff bounds on $A_n$ and $P_{2,n}$. □



Lemma 10.4. If the events in the conclusions of Lemmas [7| and /Q.3| hold for ui — ujq, 
ei = 1/6 flnc/ £2 = 0.1, then there exists a constant r > such that for every t = n, . . . ,n — 



Proof. We have 



t 



7C| < P2.t 



1 1 

t n 



\P-?,j - P: 



2,t — -f^2,n 



P2 



<P2 



LUq 



,0.2 



,0.6-1 



n(n — ujq) n 
by Lemma 10.4 and n — ojq < t < n, and the result immediately follows. D 



Lemma 10.5. If the conclusions of Lemmas 10.4 and 10.2 are true then 



J2 dTviPi,t,Qt) = o{l/iUo). 

t—n—ujQ 



Proof. By Lemma 10.4 and the inequalities on total variation distance there exist ri , r2 > 
such that 



drviAPpoiX)) < drv Afa.t,^^ 



P2,t 



< ri — \- r2n ' < r^n 



dTv{Po{ ^Ypofr^c 



where r^ = n + r2. Employing Lemma [7^ it follows that 

n n 

Y, dTv{PiA,Qt)<r:i Y, tn-''-^<r3n- 
and this amount is o{1/ujq). 



-0.4 '^O 



D 



Observation 3. The probability that the conditions in the previous lemma are not 
fulfilled is at most lOq/u — n^'^'^. Indeed, the events that ensure the applicability of the 
previous lemma are: 
P AP an ^ 1^0.1



1. for every te[n,...,n/2],A',^t,A^^,,Af^,< 



n 



2. for all t E [n, . . . ,n — ujq], Af^. = 0, and 

3. |P2,«-|cn|,|7V2,n-icn|,<nO-6 

The first and the third events have probability 1 — o{\/poly) (as they come from applying 



Chernoff bounds). The second fails (for a specific t) with probability at most ^^ < LLj'^/{n — 
loq), so its total probability is at most loq ■ LO^Kn — loq). Both terms can be absorbed into 

Wo/"- 

Lemma 10.6. If the event in LemmaUjholds then w.h.p. PUR does not reject at stage t, 
for every t in the range n, n — 1, . . . ,n — loq, given that $ survives up to this stage. 



Proof. To prove Lemma 10.6 we show that, with high probability the unit clauses of each 
<l>t involve different variables. This can be seen as follows: consider Pi j + iVi t balls to 
be thrown into t urns. The probability that two of them arrive in the same urn is at most 
( ^'^ ^ ') ' 7- '^^^^ ^^ upper bounded by 2in-u) \ ■ Summing this for t — n, . . . ,n — loq 
yields an upper bound, which is o(l). D 



The proof for the case $c < 3/2$ follows easily from these results: with probability $1 - o(1)$ all the events in Lemmas 10.1, 10.2, 10.3, 10.5, and 10.6 and in Corollary 1 take place, therefore PUR does not reject at any of the stages $n$ to $n - \omega_0$ and $P_{1,t}$ is close to $Q_t$ in the sense of Lemma 10.5. Therefore the probability that for some $t$ in this range $P_{1,t} = 0$ (i.e. PUR accepts) differs by $o(1)$ from the corresponding probability for $Q_t$. But according to the result by Frieze and
Suen [|l^ this latter probability is 1 - o(l). 

10.2. The case c > 3/2. Define wi = nP'^^. The following are the auxiliary results we 
use in this case: 

Lemma 10.7. Let A = n"'®^. For every k > there exists a constant c^ > such that 
for every r > the probability that there exists t £ [n — ujQ,n — uji], Af^ + Aff_^ + . . . + 

^i.t-r+i ^ k is at most Ck{i^i — uJo){rA/n)''. 

Proof. By Corollary IT] we can assume that Pit < A. Then for every i, 

Pr[A(^t >i]^ Pr[B{Pi,t -!,-)> i]< Pr[B{A, -) > i] 



A\ fiy f i^^-' 



— SQliJl'-. 



The event $\Delta^P_{01,t} + \Delta^P_{01,t-1} + \ldots + \Delta^P_{01,t-r+1} \ge k$ happens when:

• one of the terms is at least $k$, or

• one of the terms is at least $k - 1$, and another one is at least 1, or

• at least $k$ of the terms are at least one

(a finite number of possibilities). Applying the previous inequality, and taking into account that $r, k$ are fixed, immediately proves the lemma.

To flesh out the argument outlined before we construct a succession of Markov chains running along $P_{1,t}$, that provide better and better "approximations" to $Q_t$. Our use of indices will be slightly nonstandard (to reflect the connection with $P_{1,t}$), in that the sequence of indices starts with $n - \omega_0$ and is decreasing.

Definition 10.8. Let X^-ujo = Yn-u^o = ^n-t^o = Qn-u^o = ^i,"-'^o ^nd 



Xt-i =Xt~{p + 3)xpz+iin - Wo - t) + A( 



p 



2,ti 



,,^-. . Yt^i=Yt-{p + 3)xpZ+iin-iJo-t) + B{P2,n-u.,,l/t), 

C1U.2) <; Zt-i^Zt-{p + 3)xpZ+i{n-LOo-t) + B{P2,n-.j„^), 



Let $\alpha = \Pr[(\exists t \in [n - \omega_0, n - \omega_1]) : P_{1,t} = 0]$. Note that the amount $p + 3$ is subtracted from $X_t, Y_t, Z_t$ exactly once in every $p$ consecutive steps, so whenever the condition (*) is satisfied it holds that $X_t \le P_{1,t}$ for every $t \in [n - \omega_0, n - \omega_1]$. By coupling $\Delta^P_{12,t}$ $(= B(P_{2,t}, 1/t))$ with $B(P_{2,n-\omega_1}, 1/t)$ we deduce that we can couple $X_t$ and $Y_t$ so that $Y_t \le X_t$. We can also couple $Y_t$ and $Z_t$ such that $Z_t \le Y_t$. Finally, notice that we can couple $Z_{n-\omega_0-jp}$ and $Q_{n-\omega_0-j(p+3)}$ such that $Q_{n-\omega_0-j(p+3)} \le Z_{n-\omega_0-jp}$. So an upper bound on $\alpha$ is $\Pr[(\exists t \in [0, n - \omega_0]) : Q_t = 0]$. With high probability the binomial distribution in the definition of the chain $Q_t$ has average strictly greater than one (because the flow from $P_{2,t}$ is approximately Poisson), and $Q_{n-\omega_0} = \Omega(\omega_0)$, therefore, by an elementary property of the queuing chain, the probability that $Q_t$ hits state 0 is exponentially small. This yields the desired conclusion, that $\alpha = o(1)$.

One word about the way to prove the fact that $\Phi_{n-\omega_1}$ is unsatisfiable (if defined): one can prove that w.h.p. both $P_{1,n-\omega_1}$ and $N_{1,n-\omega_1}$ are $\Theta(\omega_1)$. By the uniformity lemma (Lemma 9.1) we are left with the following instance of the occupancy problem: there are $P_{1,n-\omega_1}$ white balls, $N_{1,n-\omega_1}$ black balls and $n - \omega_1$ bins. The desired fact now follows from the second part of Lemma 7.8.



11. Proof of Theorem 6.4. Theorem 6.4 is proved along lines very similar to the proof of Theorem 6.3. The basis is the following generalization of Lemma 9.1:

Lemma 11.1. Suppose that $\Phi$ survives up to stage $t$. Then, conditional on the values $(P_{1,t}, N_{1,t}, \ldots, P_{k,t}, N_{k,t})$, the clauses in $\Phi^P_{1,t}, \Phi^N_{1,t}, \ldots, \Phi^P_{k,t}, \Phi^N_{k,t}$ are chosen uniformly at random and are independent. Also, conditional on the fact that $\Phi$ survives stage $t$ as well, the following recurrences hold:



> 

l.t- 


-1 


= ^i,t 


-1-A 


P -1- A^ 

31. t "T ^12, ti 


h.t 


-1 


^Ni,t 


+ Af2, 




> 


1 


= P^X - 


- Ao^,, - 


AC_i).. + A 


h,t- 


-1 


^N,^t 


-^U 


+ A^ 



fori = 2,k, 
where (in distribution)



(11.2) 



/Sf - A^ - 



Proof.

The uniformity condition and the justification of the recurrences are entirely analogous to the ones from Lemma 9.1. The additional technical complication is that now there is a "positive flow into $P_{2,t}$, $N_{2,t}$." □



Lemma 11.2. With high probability it holds that 



P,, = (l + o(l)).-.A,.z.(J.5^+/-*, 



n \i 



c . li\ „n+l-t 



and 

for every i > 2, and uniformly on t — n — o(n). 

Proof. 

Let us first heuristically derive a formula for Xi^t, Vi.t, the expected values of Pi^t, Ni^t, 
obtained by replacing the binomial distributions in the equations by their expected values. 
We have: 



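$x_{i,t-1} = x_{i,t} - \frac{i \, x_{i,t}}{t} + \frac{i \, x_{i+1,t}}{t}$,
$y_{i,t-1} = y_{i,t} - \frac{i \, y_{i,t}}{t} + \frac{(i+1) \, y_{i+1,t}}{t}$, for $i = 2, \ldots, k$.

Rearranging terms the recurrences become

(11.4)
$x_{i,t-1} = x_{i,t}\left(1 - \frac{i}{t}\right) + x_{i+1,t} \frac{i}{t}$, for $i = 2, \ldots, k$,
$y_{i,t-1} = y_{i,t}\left(1 - \frac{i}{t}\right) + y_{i+1,t} \frac{i+1}{t}$, for $i = 2, \ldots, k$.

Also,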



Xi 



(11.5) \ ""''" ^^ 






-G^-cAfe-^ = ^Afc-("). 

/i fc '^ n n '^ V z / 



A simple induction shows that these expected values are Xi,t = :f ' '^fe ■*■(*) ■ '5'^-/ *' ^"d 

The concentration property can be proved inductively, starting from $i = k$ towards 3, by noting that the expected values of the binomial terms in the recurrence are $\omega(n)$, hence, by the Chernoff bound, the probability that they significantly deviate from their expected values is exponentially small.

Almost the same argument holds for $P_{2,t}$ (and for $N_{2,t}$). The only amounts to be handled differently are "the clause flows out of $P_{2,t}, N_{2,t}$," but they are approximately Poisson distributed, hence "small" with high probability by Proposition 7.2. Therefore



The previous lemma implies that Aj f ~ Po{c ■ X^ ■ S'^_2 ) (f°'' t = n — o(n)y, thus 
in this range Pi^t-i ~ Pi,t — 1 + Po{c ■ Xk ■ Sk^2~*)- The proof follows exactly the same 
pattern as in the case $c < 3/2$ for $k = 2$: the conclusion for the stages $[n, n - \omega_0]$ is that the probability that $P_{1,t}$ is zero somewhere in this range differs by $o(1)$ from the corresponding probability for the queuing chain in (6.3). The fact that the stages after $[n, n - \omega_0]$ have a contribution of $o(1)$ to the final accepting probability can be seen from the fact that it is possible to couple the Markov chain $M_1$, describing the evolution of PUR on a random k-SAT formula, and $M_2$, which runs on the 2-CNF component of the formula, such that for every $t$ we have $P^{(2)}_{1,t} \le P^{(1)}_{1,t}$, where $P^{(j)}_{1,t}$ denotes the number of positive unit clauses in $M_j$. Perhaps the most intuitive way to see this coupling is to "paint" the initial clauses of the formula having size at most two in red, and the other clauses in blue. At every step $t$, $P^{(2)}_{1,t}$ will count only red clauses having unit size at step $t$, while $P^{(1)}_{1,t}$ will count clauses of both colors.

Given the stochastic domination, the desired result follows from the corresponding proof in the case $k = 2$. □



12. Proof of Proposition 6.5. The idea of the proof is to consider PUR on a random 
at-most-fc-Horn formula $ with c • — clauses and prove that there exists a function </)(/;;) 

with lim^,^oo 0(fc) — such that 

lim Pr[PUR accepts in at least k steps ] < (/'(fc). 

n — »oo 

Indeed, from the previous proof it follows that lim„^oo Pr[PUR accepts in > fc steps ] 
satisfies the recurrence: 

xt+i=xi^t-l + Po{c-Sit\^''), 

where 

2^0 = Pl,k > 1- 

We define 0(fc) to be the probability that the sequence in the recurrence ( [l2| ) hits zero. Triv- 
ially limfe_foo S^^l — 00, so the expected values of the Poisson distributions in ( jl2[ ) can be 
made larger than any given constant A. Using the fact that the sum of two Poisson distribu- 
tions with parameters a and b has a Poisson distribution with parameter a + & it follows that, 
for large enough fc, one can couple xt with the queuing chain 

yt+i =2/i.t - l + Po(A), 



J/0 
such that $y_t \le x_t$. It follows that, for large $k$, $\phi(k) \le \Pr[\text{the chain } y_t \text{ hits state zero}]$. Since $\lambda$ was arbitrary, it follows that $\lim_{k\to\infty} \phi(k) = 0$.

Now consider a random uniform Horn formula $ with c • — clauses, and let <1> be its 
subformula consisting of clauses of size at most k. It is easily seen that the behavior of PUR 
on the first k — 1 steps depends only on the clauses of ^, so 

Pr[PUR accepts <1> in less than k steps] = Pr[PUR accepts <1> in less than k steps]. 

On the other hand we have 

< Pr[PUR accepts $ in at least k steps] < Pr[PUR accepts $ in at least k steps]. 

The fact that "$ is close to a random formula in fl{n,k,c ■ -^)" (see the discussion in 
Observation 0) implies that the right-hand side term can be made less than any fixed constant 
e (for n, k big enough). It follows that 

1 Pr[PUR accepts $] - Pr[PUR accepts $]] < 2 • e, 
for large enough values of n, k. This immediately implies the desired result. n 



13. Proof of Theorem 6.6. Theorem 6.6 is based on the proof of Theorem 6.3 and an elementary property of the queuing chain $Q_t$ (the expected time to hit state zero, conditional on actually hitting it, has the desired form).

The crucial point is to prove that the probabilities that any of the conditions we have employed in our analysis fails have a negligible effect on the running time.

This is easy to see for stages smaller than $n - \omega_0$: since the probabilities that the various steps of the analysis fail are either exponentially small or can be made $o(1/n)$ (by choosing a large enough $k$ in Lemma 10.7), the probability that $P_{1,t}$ hits state zero after stage $n - \omega_0$ is $o(1/n)$, therefore its influence on the average running time of PUR is $o(1)$. The corresponding observation is not true for stages before $n - \omega_0$, but these stages can be handled directly, using the statement from Lemma 10.5. □



14. Random Horn satisfiability as a mean-field approximation. What we have shown so far is that (under a suitable rescaling) the probability graphs for random at-most-k Horn satisfiability converge to the graph for random Horn satisfiability. To be able to argue that our results display critical behavior, we have to be able to show that this latter probability $p_\infty$ is indeed the one predicted by some mean-field approximation.

In the sequel we will show that this is indeed the case. However the mean-field approx- 
imation is not the one from [ pO| ] , and incorporates a correction specific to the properties of 
random Horn satisfiability. 

Let us first see that it is not accurate if no correction is taken into account. Indeed, were 
it true we would have 

lim Pr[<P e HORN-SAT] = 1 - lim TT (1 - Pr[A h *]) • 

Ae{o,i}" 

Since, for an assignment A of Hamming weight i there are exactly 2' — 1 + (n — i) • 2* Horn 
clauses that A falsifies, we have 
Pr[^ 1= $] = 1



(n + 2) • 2" - 1 



so the mean-field prediction reads 



lim Pr[$ e HORN-SAT] = 1 - lim fT ( 1 - ( 1 ^^ ^^ ' ^^ 



(n + 2) • 2" - 1 
j=o \ ^ ^ ^ 



c-2 



"X^) 



1 — f 1 — |-„~^2V^,._\ ) I has limit 0, the mean-field prediction would imply that 



All terms in the product are less than 1. Since the term corresponding to j = 1 

c-2" 

lLL ■ 

(n+2)-2'' 

lim„^oo P^^ e HORN-SAT] = 1. On the other hand let us observe that, if we do not 
consider the power (") in the infinite product we obtain the right result: it is a simple but 
tedious task to prove that 



c-2"\ oo 

t^oolil V (n + 2-2"-l / / J-J-V 



Intuitively this means that "there exist a correction of the mean-field approximation that 
only considers a single assignment of each weight, and is accurate." The following simple 
result gives a precise statement to the above intuition: 

Lemma 14.1. Suppose $\Phi$ is given as a union of formulas $\Phi_1, \ldots, \Phi_n$, where $\Phi_i$ contains all clauses of length exactly $i$. Then there is a set $T = \{T_0, \ldots, T_{n-1}\}$ of assignments, with $T_i$ of Hamming weight exactly $i$ and depending only on $\Phi_1 \cup \ldots \cup \Phi_{i+1}$, such that $\Phi$ is satisfiable if and only if it is satisfied by some assignment in $T$. 
Proof. 

Let $y_1 \ldots y_k$ denote the assignment that makes $y_1 = \ldots = y_k = 1$, and all the other variables equal to zero. 

The set $T$ has two parts: the first is simply the set of assignments implicitly examined by the algorithm PUR in testing satisfiability. That is, if $x_1, \ldots, x_k$ are the variables assigned by PUR in this order, the first part includes the assignments $0\ldots0,\ x_1,\ \ldots,\ x_1 \ldots x_k$. The second part contains a random assignment for each remaining weight. □ 
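The first part of $T$ can be illustrated with a small sketch. It is an illustration under stated assumptions rather than the paper's PUR: clauses are represented as hypothetical pairs `(neg, pos)`, propagation always picks the first available positive unit clause, and the loop does not stop early when a negative clause becomes falsified. The names `satisfies`, `pur_chain` and `satisfied_by_chain` are introduced here for illustration only.

```python
def satisfies(true_vars, clause):
    """Return True if the assignment with exactly `true_vars` set to 1 satisfies `clause`.

    A clause is a pair (neg, pos): `neg` is a frozenset of negated variables,
    `pos` is the single positive variable or None.
    """
    neg, pos = clause
    # The clause is falsified only if every negated variable is 1
    # and the positive literal (if any) is 0.
    return not (neg <= true_vars and (pos is None or pos not in true_vars))


def pur_chain(formula):
    """Return the assignments visited by positive unit propagation.

    The result is a list [{}, {x1}, {x1, x2}, ...] of frozensets of true
    variables, one of each Hamming weight 0, 1, 2, ...
    """
    true_vars = frozenset()
    chain = [true_vars]
    while True:
        # A clause behaves as a positive unit clause once all its negated
        # variables are already 1 and its positive variable is still 0.
        heads = [
            pos
            for neg, pos in formula
            if pos is not None and pos not in true_vars and neg <= true_vars
        ]
        if not heads:
            return chain
        true_vars = true_vars | {heads[0]}
        chain.append(true_vars)


def satisfied_by_chain(formula):
    """Return True if some assignment visited by propagation satisfies the formula."""
    return any(
        all(satisfies(assignment, clause) for clause in formula)
        for assignment in pur_chain(formula)
    )


if __name__ == "__main__":
    # x0, (not x0 or x1), (not x0 or not x1 or x2): satisfiable, chain has weights 0..3.
    example = [
        (frozenset(), 0),
        (frozenset({0}), 1),
        (frozenset({0, 1}), 2),
    ]
    print([sorted(a) for a in pur_chain(example)])
    print(satisfied_by_chain(example))
```

In this sketch every assignment in the chain except the last falsifies the clause that triggers the next propagation step, so satisfiability is decided by the last assignment; this is the sense in which a single assignment per weight suffices for the weights that propagation reaches.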

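The result has a "mean-field" interpretation: as before, define $f(x_1, \ldots, x_n) = 1 - \prod_{i=1}^{n} x_i$ and the function $g_k[\Phi]$ to be the indicator function for the event "$T_k \not\models \Phi$, given that event $\overline{A}_n \wedge \ldots \wedge \overline{A}_{n-k+1}$ happens," i.e. 

$$g_k[\Phi] = \frac{1}{\Pr[\overline{A}_n \wedge \ldots \wedge \overline{A}_{n-k+1}]} \cdot \begin{cases} 1, & \text{if } T_k \not\models \Phi \wedge \overline{A}_n \wedge \ldots \wedge \overline{A}_{n-k+1}, \\ 0, & \text{otherwise.} \end{cases}$$

We have 

$$E[g_k[\Phi]] = \Pr[\overline{A}_{n-k} \mid \overline{A}_n \wedge \ldots \wedge \overline{A}_{n-k+1}].$$

Indeed, $g_k[\Phi] \neq 0$ exactly when $R_n \vee \ldots \vee R_{n-k+1}$ or $T_k \not\models \Phi \wedge S_n \wedge \ldots \wedge S_{n-k+1}$. The second event is equivalent to $\overline{A}_{n-k} \wedge S_n \wedge \ldots \wedge S_{n-k+1}$, hence we have $g_k[\Phi] \neq 0$ exactly when $\overline{A}_{n-k} \wedge \overline{A}_n \wedge \ldots \wedge \overline{A}_{n-k+1}$ holds. 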




Thus we have, by the discussion in the previous chapter, 

$$f(E[g_1[\Phi]], \ldots, E[g_n[\Phi]]) = 1 - \prod_{k=0}^{n} \Pr[\overline{A}_{n-k} \mid \overline{A}_n \wedge \ldots \wedge \overline{A}_{n-k+1}] = \Pr[\Phi \in \text{HORN-SAT}].$$
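The product of conditional probabilities in this display collapses by the chain rule. A short worked step, assuming that the events involved are the complements $\overline{A}_i$, that every conditioning event has positive probability, and that the $k = 0$ factor is read as the unconditional $\Pr[\overline{A}_n]$:

```latex
\prod_{k=0}^{n} \Pr\bigl[\overline{A}_{n-k} \mid \overline{A}_n \wedge \dots \wedge \overline{A}_{n-k+1}\bigr]
  = \Pr\bigl[\overline{A}_n \wedge \overline{A}_{n-1} \wedge \dots \wedge \overline{A}_0\bigr]
```

The product is therefore the probability that none of the events $A_i$ occurs; identifying one minus this quantity with the satisfiability probability is the step taken from the previous chapter.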

The above correction seems to be specific to the random model for Horn satisfiability, 
which allows clauses of varying lengths. 

To sum up: the mean-field approximation is true, modulo a correction that takes into 
account some particular features of the random model for Horn satisfiability. 

15. Discussion. We have characterized the asymptotic satisfiability probability of a random $k$-Horn formula, and showed that it exhibits behavior very similar to the one uncovered experimentally in pct l . 

We have also displayed an "easy-hard-easy" pattern similar to the ones observed ex- 
perimentally in the AI literature. In our case the pattern is fully explained by elementary 
properties of the queuing chain. 

As for an explanation of the "critical behavior", consider an intermediate stage $i$ of PUR and let $C_j$ be the set of clauses of $\Phi_i$. It is clear that whether PUR accepts depends only on the number of clauses in $C_1$. The restriction on the clause length acts like a "dampening" perturbation (in that it eliminates the "clause flow into $C_k$"). The proof of Theorem 6.2 shows that when $k(n) \to \infty$, with high probability PUR accepts (if $\Phi$ is satisfiable) "before the perturbation reaches $C_1$"; therefore the satisfiability probability is the one from the uniform case. On the other hand, for any constant $k$, with positive probability PUR does not halt during the first $k$ iterations (for the exact value see [17]), and the dampening has a significant influence. Thus the explanation for the occurrence (and specific form) of critical behavior is a threshold property for the number of iterations of PUR on random satisfiable Horn formulas "in the critical region". 

A related, and somewhat controversial, open issue is whether random Horn satisfiability 
properly displays critical behavior. Problems with a sharp threshold display "critical" (i.e., 
singular) behavior in at least one parameter, the satisfaction probability, which conceivably 
allows the definition of critical exponents. This is not so for random $k$-Horn satisfiability, 
which has a coarse threshold and no criticality for $k > 2$; hence the question seems not to 
be meaningful. Note, however, that the order parameter involved in the recent study of the 
phase transition in 2-SAT Q is not satisfaction probability, but the (expected size) of the 
so-called backbone (or its more tractable version spine) of a random formula. The "window" 
that we use to peek at the threshold behavior of random Horn satisfiability does not seem to 
be "naturally required" by any physical considerations, and it is possible in principle that the 
random Horn formulas display critical behavior if we take the spine as the order parameter. 